Q1: What is the trend in cases, mortality across geopgraphical regions?
Plot # of cases vs time
* For each geographical set:
* comparative longitudinal case trend (absolute & log scale)
* comparative longitudinal mortality trend
* death vs total correlation
| comparative_longitudinal_case_trend |
long |
time |
log_cases |
geography |
none (case type?) |
case_type |
[15, 50, 4] geography x (2 scale?) case type |
| comparative longitudinal case trend |
long |
time |
cases |
geography |
case_type |
? |
[15, 50, 4] geography x (2+ scale) case type |
| comparative longitudinal mortality trend |
wide |
time |
mortality rate |
geography |
none |
none |
[15, 50, 4] geography |
| death vs total correlation |
wide |
cases |
deaths |
geography |
none |
none |
[15, 50, 4] geography |
# total cases vs time
# death cases vs time
# mortality rate vs time
# death vs mortality
# death vs mortality
# total & death case vs time (same plot)
#<question> <x> <y> <colored> <facet> <dataset>
## trend in case/deaths over time, comapred across regions <time> <log cases> <geography*> <none> <.wide>
## trend in case/deaths over time, comapred across regions <time> <cases> <geography*> <case_type> <.long>
## trend in mortality rate over time, comapred across regions <time> <mortality rate> <geography*> <none>
## how are death/mortality related/correlated? <time> <log cases> <geography*> <none>
## how are death and case load correlated? <cases> <deaths>
# lm for each?? - > apply lm from each region starting from 100th case. m, b associated with each.
# input: geographical regsion, logcase vs day (100th case)
# output: m, b for each geographical region ID
#total/death on same plot- diffeer by 2 logs, so when plotting log, use pch. when plotting absolute, need to use free scales
#when plotting death and case on same, melt.
#CoronaCases - > filter sets (3)
#world - choose countries with sufficent data
N<-ddply(filter(Corona_Cases,Total_confirmed_cases>100),c("Country.Region"),summarise,n=length(Country.Region))
ggplot(filter(N,n<100),aes(x=n))+
geom_histogram()+
default_theme+
ggtitle("Distribution of number of days with at least 100 confirmed cases for each region")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

kable(arrange(N,-n),caption="Sorted number of days with at least 100 confirmed cases")
Sorted number of days with at least 100 confirmed cases
| US_state |
9348 |
| China |
90 |
| Diamond Princess |
71 |
| Korea, South |
61 |
| Japan |
60 |
| Italy |
58 |
| Iran |
55 |
| Singapore |
52 |
| France |
51 |
| Germany |
51 |
| Spain |
50 |
| US |
49 |
| Switzerland |
47 |
| United Kingdom |
47 |
| Belgium |
46 |
| Netherlands |
46 |
| Norway |
46 |
| Sweden |
46 |
| Austria |
44 |
| Malaysia |
43 |
| Australia |
42 |
| Bahrain |
42 |
| Denmark |
42 |
| Canada |
41 |
| Qatar |
41 |
| Iceland |
40 |
| Brazil |
39 |
| Czechia |
39 |
| Finland |
39 |
| Greece |
39 |
| Iraq |
39 |
| Israel |
39 |
| Portugal |
39 |
| Slovenia |
39 |
| Egypt |
38 |
| Estonia |
38 |
| India |
38 |
| Ireland |
38 |
| Kuwait |
38 |
| Philippines |
38 |
| Poland |
38 |
| Romania |
38 |
| Saudi Arabia |
38 |
| Indonesia |
37 |
| Lebanon |
37 |
| San Marino |
37 |
| Thailand |
37 |
| Chile |
36 |
| Pakistan |
36 |
| Luxembourg |
35 |
| Peru |
35 |
| Russia |
35 |
| Ecuador |
34 |
| Slovakia |
34 |
| South Africa |
34 |
| United Arab Emirates |
34 |
| Armenia |
33 |
| Colombia |
33 |
| Croatia |
33 |
| Mexico |
33 |
| Panama |
33 |
| Serbia |
33 |
| Taiwan* |
33 |
| Turkey |
33 |
| Argentina |
32 |
| Bulgaria |
32 |
| Latvia |
32 |
| Algeria |
31 |
| Costa Rica |
31 |
| Dominican Republic |
31 |
| Hungary |
31 |
| Uruguay |
31 |
| Andorra |
30 |
| Bosnia and Herzegovina |
30 |
| Jordan |
30 |
| Lithuania |
30 |
| Morocco |
30 |
| New Zealand |
30 |
| North Macedonia |
30 |
| Vietnam |
30 |
| Albania |
29 |
| Cyprus |
29 |
| Malta |
29 |
| Moldova |
29 |
| Brunei |
28 |
| Burkina Faso |
28 |
| Sri Lanka |
28 |
| Tunisia |
28 |
| Ukraine |
27 |
| Azerbaijan |
26 |
| Ghana |
26 |
| Kazakhstan |
26 |
| Oman |
26 |
| Senegal |
26 |
| Venezuela |
26 |
| Afghanistan |
25 |
| Cote d’Ivoire |
25 |
| Cuba |
24 |
| Mauritius |
24 |
| Uzbekistan |
24 |
| Cambodia |
23 |
| Cameroon |
23 |
| Honduras |
23 |
| Nigeria |
23 |
| West Bank and Gaza |
23 |
| Belarus |
22 |
| Georgia |
22 |
| Bolivia |
21 |
| Kosovo |
21 |
| Kyrgyzstan |
21 |
| Montenegro |
21 |
| Congo (Kinshasa) |
20 |
| Kenya |
19 |
| Niger |
18 |
| Guinea |
17 |
| Rwanda |
17 |
| Trinidad and Tobago |
17 |
| Paraguay |
16 |
| Bangladesh |
15 |
| Djibouti |
13 |
| El Salvador |
12 |
| Guatemala |
11 |
| Madagascar |
10 |
| Mali |
9 |
| Congo (Brazzaville) |
6 |
| Jamaica |
6 |
| Gabon |
4 |
| Somalia |
4 |
| Tanzania |
4 |
| Ethiopia |
3 |
| Burma |
2 |
| Sudan |
1 |
# Pick top 15 countries with data
max_colors<-12
# find way to fix this- China has diff provences. Plot doesnt look right...
sufficient_data<-arrange(filter(N,!Country.Region %in% c("US_state", "Diamond Princess")),-n)[1:max_colors,]
kable(sufficient_data,caption = paste0("Top ",max_colors," countries with sufficient data"))
Top 12 countries with sufficient data
| China |
90 |
| Korea, South |
61 |
| Japan |
60 |
| Italy |
58 |
| Iran |
55 |
| Singapore |
52 |
| France |
51 |
| Germany |
51 |
| Spain |
50 |
| US |
49 |
| Switzerland |
47 |
| United Kingdom |
47 |
Corona_Cases.world<-filter(Corona_Cases,Country.Region %in% c(sufficient_data$Country.Region))
#us
# - by state
Corona_Cases.US<-filter(Corona_Cases,Country.Region=="US" & Total_confirmed_cases>0)
# summarize
#!City %in% c("Unassigned")
# - specific cities
#mortality_rate!=Inf & mortality_rate<=1
Corona_Cases.UScity<-filter(Corona_Cases,Province.State %in% c("Pennsylvania","Maryland","New York","New Jersey") & City %in% c("Bucks","Baltimore City", "New York","Burlington"))
measure_vars_long<-c("Total_confirmed_cases.log","Total_confirmed_cases","Total_confirmed_deaths","Total_confirmed_deaths.log")
melt_arg_list<-list(variable.name = "case_type",value.name = "cases",measure.vars = c("Total_confirmed_cases","Total_confirmed_deaths"))
melt_arg_list$data=NULL
melt_arg_list$data=select(Corona_Cases.world,-ends_with(match = "log"))
Corona_Cases.world.long<-do.call(melt,melt_arg_list)
melt_arg_list$data=select(Corona_Cases.UScity,-ends_with(match = "log"))
Corona_Cases.UScity.long<-do.call(melt,melt_arg_list)
melt_arg_list$data=select(Corona_Cases.US_state,-ends_with(match = "log"))
Corona_Cases.US_state.long<-do.call(melt,melt_arg_list)
Corona_Cases.world.long$cases.log<-log(Corona_Cases.world.long$cases,10)
Corona_Cases.US_state.long$cases.log<-log(Corona_Cases.US_state.long$cases,10)
Corona_Cases.UScity.long$cases.log<-log(Corona_Cases.UScity.long$cases,10)
# what is the current death and total case load for US? For world? For states?
#-absolute
#-log
# what is mortality rate (US, world)
#-absolute
#how is death and case correlated? (US, world)
#-absolute
#Corona_Cases.US<-filter(Corona_Cases,Country.Region=="US" & Total_confirmed_cases>0)
#Corona_Cases.US.case100<-filter(Corona_Cases.US, Days_since_100>=0)
# linear model parameters
#(model_fit<-lm(formula = Total_confirmed_cases.log~Days_since_100,data= Corona_Cases.US.case100 ))
#(slope<-model_fit$coefficients[2])
#(intercept<-model_fit$coefficients[1])
# Correlation coefficient
#cor(x = Corona_Cases.US.case100$Days_since_100,y = Corona_Cases.US.case100$Total_confirmed_cases.log)
##------------------------------------------
## Plot World Data
##------------------------------------------
# Timestamp for world
timestamp_plot.world<-paste("Most recent date for which data available:",max(Corona_Cases.world$Date))#timestamp(quiet = T,prefix = "Updated ",suffix = " (EST)")
# Base template for plots
baseplot.world<-ggplot(data=NULL,aes(x=Days_since_100,col=Country.Region))+
default_theme+
scale_color_brewer(type = "qualitative",palette = "Paired")+
ggtitle(paste("Log10 cases over time,",timestamp_plot.world))+
theme(legend.position = "bottom",plot.title = element_text(size=12))
##/////////////////////////
### Plot Longitudinal cases
(Corona_Cases.world.long.plot<-baseplot.world+
geom_point(data=Corona_Cases.world.long,aes(y=cases))+
geom_line(data=Corona_Cases.world.long,aes(y=cases))+
facet_wrap(~case_type,scales = "free_y",ncol=1)+
ggtitle(timestamp_plot.world)
)

(Corona_Cases.world.loglong.plot<-baseplot.world+
geom_point(data=Corona_Cases.world.long,aes(y=cases.log))+
geom_line(data=Corona_Cases.world.long,aes(y=cases.log))+
facet_wrap(~case_type,scales = "free_y",ncol=1)+
ggtitle(timestamp_plot.world))

##/////////////////////////
### Plot Longitudinal mortality rate
(Corona_Cases.world.mortality.plot<-baseplot.world+
geom_point(data=Corona_Cases.world,aes(y=mortality_rate))+
geom_line(data=Corona_Cases.world,aes(y=mortality_rate))+
ylim(c(0,0.3))+
ggtitle(timestamp_plot.world))
## Warning: Removed 100 rows containing missing values (geom_point).
## Warning: Removed 100 row(s) containing missing values (geom_path).

##/////////////////////////
### Plot death vs total case correlation
(Corona_Cases.world.casecor.plot<-ggplot(Corona_Cases.world,aes(x=Total_confirmed_cases,y=Total_confirmed_deaths,col=Country.Region))+
geom_point()+
geom_line()+
default_theme+
scale_color_brewer(type = "qualitative",palette = "Paired")+
ggtitle(paste("Log10 cases over time,",timestamp_plot.world))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
ggtitle(timestamp_plot.world))

### Write polots
write_plot(Corona_Cases.world.long.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.world.long.plot.png"
write_plot(Corona_Cases.world.loglong.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.world.loglong.plot.png"
write_plot(Corona_Cases.world.mortality.plot,wd = results_dir)
## Warning: Removed 100 rows containing missing values (geom_point).
## Warning: Removed 100 row(s) containing missing values (geom_path).
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.world.mortality.plot.png"
write_plot(Corona_Cases.world.casecor.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.world.casecor.plot.png"
##------------------------------------------
## Plot US State Data
##-----------------------------------------
baseplot.US<-ggplot(data=NULL,aes(x=Days_since_100_state,col=case_type))+
default_theme+
facet_wrap(~Province.State)+
ggtitle(paste("Log10 cases over time,",timestamp_plot.world))
Corona_Cases.US_state.long.plot<-baseplot.US+geom_point(data=Corona_Cases.US_state.long,aes(y=cases.log))
##------------------------------------------
## Plot US City Data
##-----------------------------------------
Corona_Cases.US.plotdata<-filter(Corona_Cases.US_state,Province.State %in% c("Pennsylvania","Maryland","New York","New Jersey") &
City %in% c("Bucks","Baltimore City", "New York","Burlington") &
Total_confirmed_cases>0)
timestamp_plot<-paste("Most recent date for which data available:",max(Corona_Cases.US.plotdata$Date))#timestamp(quiet = T,prefix = "Updated ",suffix = " (EST)")
city_colors<-c("Bucks"='#beaed4',"Baltimore City"='#386cb0', "New York"='#7fc97f',"Burlington"='#fdc086')
##/////////////////////////
### Plot death vs total case correlation
(Corona_Cases.city.loglong.plot<-ggplot(melt(Corona_Cases.US.plotdata,measure.vars = c("Total_confirmed_cases.log","Total_confirmed_deaths.log"),variable.name = "case_type",value.name = "cases"),aes(x=Date,y=cases,col=City,pch=case_type))+
geom_point(size=4)+
geom_line()+
default_theme+
#facet_wrap(~case_type)+
ggtitle(paste("Log10 total and death cases over time,",timestamp_plot))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
scale_color_manual(values = city_colors))

(Corona_Cases.city.long.plot<-ggplot(filter(Corona_Cases.US.plotdata,Province.State !="New York"),aes(x=Date,y=Total_confirmed_cases,col=City))+
geom_point(size=4)+
geom_line()+
default_theme+
facet_grid(~Province.State,scales = "free_y")+
ggtitle(paste("MD, PA, NJ total cases over time,",timestamp_plot))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
scale_color_manual(values = city_colors))

(Corona_Cases.city.mortality.plot<-ggplot(Corona_Cases.US.plotdata,aes(x=Date,y=mortality_rate,col=City))+
geom_point(size=3)+
geom_line(size=2)+
default_theme+
ggtitle(paste("Mortality rate (deaths/total) over time,",timestamp_plot))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
scale_color_manual(values = city_colors))

(Corona_Cases.city.casecor.plot<-ggplot(filter(Corona_Cases.US.plotdata,Province.State !="New York"),aes(x=Total_confirmed_deaths,y=Total_confirmed_cases,col=City))+
geom_point(size=3)+
geom_line(size=2)+
default_theme+
ggtitle(paste("Correlation of death vs total cases,",timestamp_plot))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
scale_color_manual(values = city_colors))

(Corona_Cases.city.long.normalized.plot<-ggplot(filter(Corona_Cases.US.plotdata,Province.State !="New York"),aes(x=Date,y=Total_confirmed_cases.per100,col=City))+
geom_point(size=4)+
geom_line()+
default_theme+
facet_grid(~Province.State)+
ggtitle(paste("MD, PA, NJ total cases over time per 100 people,",timestamp_plot))+
theme(legend.position = "bottom",plot.title = element_text(size=12))+
scale_color_manual(values = city_colors))

write_plot(Corona_Cases.city.long.plot,wd = results_dir_custom)
## [1] "/Users/stevensmith/Projects/coronavirus/results/custom/Corona_Cases.city.long.plot.png"
write_plot(Corona_Cases.city.loglong.plot,wd = results_dir_custom)
## [1] "/Users/stevensmith/Projects/coronavirus/results/custom/Corona_Cases.city.loglong.plot.png"
write_plot(Corona_Cases.city.mortality.plot,wd = results_dir_custom)
## [1] "/Users/stevensmith/Projects/coronavirus/results/custom/Corona_Cases.city.mortality.plot.png"
write_plot(Corona_Cases.city.casecor.plot,wd = results_dir_custom)
## [1] "/Users/stevensmith/Projects/coronavirus/results/custom/Corona_Cases.city.casecor.plot.png"
write_plot(Corona_Cases.city.long.normalized.plot,wd = results_dir_custom)
## [1] "/Users/stevensmith/Projects/coronavirus/results/custom/Corona_Cases.city.long.normalized.plot.png"
Q1b what is the model
Fit the cases to a linear model 1. Find time at which the case vs date becomes linear in each plot
2. Fit linear model for each city
# What is the predict # of cases for the next few days?
# How is the model performing historically?
Corona_Cases.US_state.summary<-ddply(Corona_Cases.US_state,
c("Province.State","Date"),
summarise,
Total_confirmed_cases_perstate=sum(Total_confirmed_cases)) %>%
filter(Total_confirmed_cases_perstate>100)
# Compute the states with the most cases (for coloring and for linear model)
top_states_totals<-head(ddply(Corona_Cases.US_state.summary,c("Province.State"),summarise, Total_confirmed_cases_perstate.max=max(Total_confirmed_cases_perstate)) %>% arrange(-Total_confirmed_cases_perstate.max),n=max_colors)
kable(top_states_totals,caption = "Top 12 States, total count ")
Top 12 States, total count
| New York |
253060 |
| New Jersey |
88722 |
| Massachusetts |
38077 |
| Pennsylvania |
33914 |
| California |
33686 |
| Michigan |
32000 |
| Illinois |
31513 |
| Florida |
27059 |
| Louisiana |
24523 |
| Connecticut |
19815 |
| Texas |
19751 |
| Georgia |
19407 |
top_states<-top_states_totals$Province.State
# Manually fix states so that Maryland is switched out for New York
top_states_modified<-c(top_states[top_states !="New York"],"Maryland")
# Plot with all states:
(Corona_Cases.US_state.summary.plot<-ggplot(Corona_Cases.US_state.summary,aes(x=Date,y=Total_confirmed_cases_perstate))+
geom_point()+
geom_point(data=filter(Corona_Cases.US_state.summary,Province.State %in% top_states),aes(col=Province.State))+
scale_color_brewer(type = "qualitative",palette = "Paired")+
default_theme+
theme(axis.text.x = element_text(angle=45,hjust=1),legend.position = "bottom")+
ggtitle("Total confirmed cases per state, top 12 colored")+
scale_x_date(date_breaks="1 week",date_minor_breaks="1 day"))

##------------------------------------------
## Fit linear model to time vs total cases
##-----------------------------------------
# First, find the date at which each state's cases vs time becomes lienar (2nd derivative is about 0)
li<-ddply(Corona_Cases.US_state.summary,c("Province.State"),find_linear_index)
# Compute linear model for each state starting at the point at which data becomes linear
for(i in 1:nrow(li)){
Province.State.i<-li[i,"Province.State"]
date.i<-li[i,"V1"]
data.i<-filter(Corona_Cases.US_state.summary,Province.State==Province.State.i & as.numeric(Date) >= date.i)
model_results<-lm(data.i,formula = Total_confirmed_cases_perstate~Date)
slope<-model_results$coefficients[2]
intercept<-model_results$coefficients[1]
li[li$Province.State==Province.State.i,"m"]<-slope
li[li$Province.State==Province.State.i,"b"]<-intercept
}
# Compute top state case load with fitted model
(Corona_Cases.US_state.lm.plot<-ggplot(filter(Corona_Cases.US_state.summary,Province.State %in% top_states_modified ))+
geom_abline(data=filter(li,Province.State %in% top_states_modified),
aes(slope = m,intercept = b,col=Province.State),lty=2)+
geom_point(aes(x=Date,y=Total_confirmed_cases_perstate,col=Province.State))+
scale_color_brewer(type = "qualitative",palette = "Paired")+
default_theme+
theme(axis.text.x = element_text(angle=45,hjust=1),legend.position = "bottom")+
ggtitle("Total confirmed cases per state, top 12 colored")+
scale_x_date(date_breaks="1 week",date_minor_breaks="1 day"))

##------------------------------------------
## Predict the number of total cases over the next week
##-----------------------------------------
predicted_days<-c(0,1,2,3,7)+as.numeric(as.Date("2020-04-20"))
predicted_days_df<-data.frame(matrix(ncol=3))
names(predicted_days_df)<-c("Province.State","days","Total_confirmed_cases_perstate")
# USe model parameters to estiamte case loads
for(state.i in top_states_modified){
predicted_days_df<-rbind(predicted_days_df,
data.frame(Province.State=state.i,
prediction_model(m = li[li$Province.State==state.i,"m"],
b =li[li$Province.State==state.i,"b"] ,
days =predicted_days )))
}
predicted_days_df$Date<-as.Date(predicted_days_df$days,origin="1970-01-01")
kable(predicted_days_df,caption = "Predicted total cases over the next week for selected states")
Predicted total cases over the next week for selected states
| NA |
NA |
NA |
NA |
| New Jersey |
18372 |
88643.21 |
2020-04-20 |
| New Jersey |
18373 |
92062.61 |
2020-04-21 |
| New Jersey |
18374 |
95482.01 |
2020-04-22 |
| New Jersey |
18375 |
98901.41 |
2020-04-23 |
| New Jersey |
18379 |
112579.01 |
2020-04-27 |
| Massachusetts |
18372 |
34300.92 |
2020-04-20 |
| Massachusetts |
18373 |
35507.08 |
2020-04-21 |
| Massachusetts |
18374 |
36713.24 |
2020-04-22 |
| Massachusetts |
18375 |
37919.40 |
2020-04-23 |
| Massachusetts |
18379 |
42744.05 |
2020-04-27 |
| Pennsylvania |
18372 |
32204.77 |
2020-04-20 |
| Pennsylvania |
18373 |
33431.39 |
2020-04-21 |
| Pennsylvania |
18374 |
34658.01 |
2020-04-22 |
| Pennsylvania |
18375 |
35884.62 |
2020-04-23 |
| Pennsylvania |
18379 |
40791.10 |
2020-04-27 |
| California |
18372 |
31015.97 |
2020-04-20 |
| California |
18373 |
32041.54 |
2020-04-21 |
| California |
18374 |
33067.11 |
2020-04-22 |
| California |
18375 |
34092.69 |
2020-04-23 |
| California |
18379 |
38194.98 |
2020-04-27 |
| Michigan |
18372 |
33714.44 |
2020-04-20 |
| Michigan |
18373 |
34937.87 |
2020-04-21 |
| Michigan |
18374 |
36161.30 |
2020-04-22 |
| Michigan |
18375 |
37384.73 |
2020-04-23 |
| Michigan |
18379 |
42278.44 |
2020-04-27 |
| Illinois |
18372 |
28503.01 |
2020-04-20 |
| Illinois |
18373 |
29504.79 |
2020-04-21 |
| Illinois |
18374 |
30506.57 |
2020-04-22 |
| Illinois |
18375 |
31508.34 |
2020-04-23 |
| Illinois |
18379 |
35515.46 |
2020-04-27 |
| Florida |
18372 |
26436.11 |
2020-04-20 |
| Florida |
18373 |
27341.82 |
2020-04-21 |
| Florida |
18374 |
28247.54 |
2020-04-22 |
| Florida |
18375 |
29153.25 |
2020-04-23 |
| Florida |
18379 |
32776.10 |
2020-04-27 |
| Louisiana |
18372 |
27020.92 |
2020-04-20 |
| Louisiana |
18373 |
28006.02 |
2020-04-21 |
| Louisiana |
18374 |
28991.13 |
2020-04-22 |
| Louisiana |
18375 |
29976.23 |
2020-04-23 |
| Louisiana |
18379 |
33916.65 |
2020-04-27 |
| Connecticut |
18372 |
18525.25 |
2020-04-20 |
| Connecticut |
18373 |
19295.74 |
2020-04-21 |
| Connecticut |
18374 |
20066.24 |
2020-04-22 |
| Connecticut |
18375 |
20836.74 |
2020-04-23 |
| Connecticut |
18379 |
23918.73 |
2020-04-27 |
| Texas |
18372 |
19177.39 |
2020-04-20 |
| Texas |
18373 |
19896.28 |
2020-04-21 |
| Texas |
18374 |
20615.16 |
2020-04-22 |
| Texas |
18375 |
21334.04 |
2020-04-23 |
| Texas |
18379 |
24209.57 |
2020-04-27 |
| Georgia |
18372 |
18076.99 |
2020-04-20 |
| Georgia |
18373 |
18740.31 |
2020-04-21 |
| Georgia |
18374 |
19403.63 |
2020-04-22 |
| Georgia |
18375 |
20066.95 |
2020-04-23 |
| Georgia |
18379 |
22720.23 |
2020-04-27 |
| Maryland |
18372 |
11876.61 |
2020-04-20 |
| Maryland |
18373 |
12330.74 |
2020-04-21 |
| Maryland |
18374 |
12784.87 |
2020-04-22 |
| Maryland |
18375 |
13239.00 |
2020-04-23 |
| Maryland |
18379 |
15055.53 |
2020-04-27 |
##------------------------------------------
## Write plots
##-----------------------------------------
write_plot(Corona_Cases.US_state.summary.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.US_state.summary.plot.png"
write_plot(Corona_Cases.US_state.lm.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/Corona_Cases.US_state.lm.plot.png"
##------------------------------------------
## Write tables
##-----------------------------------------
write.csv(predicted_days_df,file = paste0(results_dir,"predicted_total_cases_days.csv"),quote = F,row.names = F)
Q2: What is the predicted number of cases?
What is the prediction of COVID-19 based on model thus far? Additional questions:
WHy did it take to day 40 to start a log linear trend? How long will it be till x number of cases? When will the plateu happen? Are any effects noticed with social distancing? Delays
##------------------------------------------
## Prediction and Prediction Accuracy
##------------------------------------------
today_num<-max(Corona_Cases.US$Days_since_100)
predicted_days<-today_num+c(1,2,3,7)
#mods = dlply(mydf, .(x3), lm, formula = y ~ x1 + x2)
#today:
Corona_Cases.US[Corona_Cases.US$Days_since_100==(today_num-1),]
Corona_Cases.US[Corona_Cases.US$Days_since_100==today_num,]
Corona_Cases.US$type<-"Historical"
#prediction_values<-prediction_model(m=slope,b=intercept,days = predicted_days)$Total_confirmed_cases
#histoical_model<-data.frame(date=today_num,m=slope,b=intercept)
# model for previous y days
#historical_model_predictions<-data.frame(day_x=NULL,Days_since_100=NULL,Total_confirmed_cases=NULL,Total_confirmed_cases.log=NULL)
# for(i in c(1,2,3,4,5,6,7,8,9,10)){
# #i<-1
# day_x<-today_num-i # 1, 2, 3, 4
# day_x_nextweek<-day_x+c(1,2,3)
# model_fit_x<-lm(data = filter(Corona_Cases.US.case100,Days_since_100 < day_x),formula = Total_confirmed_cases.log~Days_since_100)
# prediction_day_x_nextweek<-prediction_model(m = model_fit_x$coefficients[2],b = model_fit_x$coefficients[1],days = day_x_nextweek)
# prediction_day_x_nextweek$type<-"Predicted"
# acutal_day_x_nextweek<-filter(Corona_Cases.US,Days_since_100 %in% day_x_nextweek) %>% select(c(Days_since_100,Total_confirmed_cases,Total_confirmed_cases.log))
# acutal_day_x_nextweek$type<-"Historical"
# historical_model_predictions.i<-data.frame(day_x=day_x,rbind(acutal_day_x_nextweek,prediction_day_x_nextweek))
# historical_model_predictions<-rbind(historical_model_predictions.i,historical_model_predictions)
# }
#historical_model_predictions.withHx<-rbind.fill(historical_model_predictions,data.frame(Corona_Cases.US,type="Historical"))
#historical_model_predictions.withHx$Total_confirmed_cases.log2<-log(historical_model_predictions.withHx$Total_confirmed_cases,2)
(historical_model_predictions.plot<-ggplot(historical_model_predictions.withHx,aes(x=Days_since_100,y=Total_confirmed_cases.log,col=type))+
geom_point(size=3)+
default_theme+
theme(legend.position = "bottom")+
#geom_abline(slope = slope,intercept =intercept,lty=2)+
#facet_wrap(~case_type,ncol=1)+
scale_color_manual(values = c("Historical"="#377eb8","Predicted"="#e41a1c")))
write_plot(historical_model_predictions.plot,wd=results_dir)
Q3: What is the effect on social distancing, descreased mobility on case load?
Load data from Google which compoutes % change in user mobility relative to baseline for * Recreation
* Workplace
* Residence
* Park
* Grocery
Data from https://www.google.com/covid19/mobility/
# See pre-processing section for script on gathering mobility data
# UNDER DEVELOPMENT
mobility<-read.csv("/Users/stevensmith/Projects/MIT_COVID19/mobility.csv",header = T,stringsAsFactors = F)
#mobility$Retail_Recreation<-as.numeric(sub(mobility$Retail_Recreation,pattern = "%",replacement = ""))
#mobility$Workplace<-as.numeric(sub(mobility$Workplace,pattern = "%",replacement = ""))
#mobility$Residential<-as.numeric(sub(mobility$Residential,pattern = "%",replacement = ""))
##------------------------------------------
## Show relationship between mobility and caseload
##------------------------------------------
mobility$County<-gsub(mobility$County,pattern = " County",replacement = "")
Corona_Cases.US_state.mobility<-merge(Corona_Cases.US_state,plyr::rename(mobility,c("State"="Province.State","County"="City")))
#Corona_Cases.US_state.tmp<-merge(metadata,Corona_Cases.US_state.tmp)
# Needs to happen upsteam, see todos
#Corona_Cases.US_state.tmp$Total_confirmed_cases.perperson<-Corona_Cases.US_state.tmp$Total_confirmed_cases/as.numeric(Corona_Cases.US_state.tmp$Population)
mobility_measures<-c("Retail_Recreation","Grocery_Pharmacy","Parks","Transit","Workplace","Residential")
plot_data<-filter(Corona_Cases.US_state.mobility, Date.numeric==max(Corona_Cases.US_state$Date.numeric) ) %>% melt(measure.vars=mobility_measures)
plot_data$value<-as.numeric(gsub(plot_data$value,pattern = "%",replacement = ""))
plot_data<-filter(plot_data,!is.na(value))
(mobility.plot<-ggplot(filter(plot_data,Province.State %in% c("Pennsylvania","Maryland","New Jersey","California","Delaware","Connecticut")),aes(y=Total_confirmed_cases.per100,x=value))+geom_point()+
facet_grid(Province.State~variable,scales = "free")+
xlab("Mobility change from baseline (%)")+
ylab(paste0("Confirmed cases per 100 people(Today)"))+
default_theme+
ggtitle("Mobility change vs cases"))

(mobility.global.plot<-ggplot(plot_data,aes(y=Total_confirmed_cases.per100,x=value))+geom_point()+
facet_wrap(~variable,scales = "free")+
xlab("Mobility change from baseline (%)")+
ylab(paste0("Confirmed cases (Today) per 100 people"))+
default_theme+
ggtitle("Mobility change vs cases"))

plot_data.permobility_summary<-ddply(plot_data,c("Province.State","variable"),summarise,cor=cor(y =Total_confirmed_cases.per100,x=value),median_change=median(x=value)) %>% arrange(-abs(cor))
kable(plot_data.permobility_summary,caption = "Ranked per-state mobility correlation with total confirmed cases")
Ranked per-state mobility correlation with total confirmed cases
| Alaska |
Transit |
-1.0000000 |
-63.0 |
| Delaware |
Retail_Recreation |
1.0000000 |
-39.5 |
| Delaware |
Grocery_Pharmacy |
1.0000000 |
-17.5 |
| Delaware |
Parks |
-1.0000000 |
20.5 |
| Delaware |
Transit |
1.0000000 |
-37.0 |
| Delaware |
Workplace |
1.0000000 |
-37.0 |
| Delaware |
Residential |
-1.0000000 |
14.0 |
| Hawaii |
Parks |
0.9855500 |
-72.0 |
| Hawaii |
Transit |
0.9682915 |
-89.0 |
| Alaska |
Residential |
0.9646018 |
13.0 |
| Utah |
Workplace |
-0.9461129 |
-37.0 |
| Vermont |
Parks |
0.9218804 |
-35.5 |
| Hawaii |
Grocery_Pharmacy |
0.9196704 |
-34.0 |
| New Hampshire |
Parks |
0.9113509 |
-20.0 |
| South Dakota |
Parks |
0.9065089 |
-26.0 |
| Connecticut |
Grocery_Pharmacy |
-0.8993428 |
-6.0 |
| Utah |
Retail_Recreation |
-0.8962660 |
-40.0 |
| Hawaii |
Retail_Recreation |
0.8606345 |
-56.0 |
| Massachusetts |
Workplace |
-0.8533862 |
-39.0 |
| Alaska |
Grocery_Pharmacy |
-0.8494172 |
-7.0 |
| Utah |
Grocery_Pharmacy |
-0.8139552 |
-4.0 |
| Connecticut |
Transit |
-0.7919371 |
-50.0 |
| Rhode Island |
Workplace |
-0.7659157 |
-39.5 |
| Utah |
Residential |
-0.7538373 |
12.0 |
| North Dakota |
Parks |
-0.7395464 |
-34.0 |
| New Mexico |
Parks |
0.7200952 |
-31.5 |
| Utah |
Transit |
-0.7198309 |
-18.0 |
| New Jersey |
Workplace |
-0.7131262 |
-44.0 |
| North Dakota |
Retail_Recreation |
-0.7054369 |
-43.5 |
| Massachusetts |
Retail_Recreation |
-0.6899492 |
-44.0 |
| California |
Retail_Recreation |
-0.6743051 |
-44.0 |
| Utah |
Parks |
-0.6706987 |
17.0 |
| California |
Workplace |
-0.6698821 |
-36.0 |
| New York |
Workplace |
-0.6697008 |
-34.5 |
| Vermont |
Grocery_Pharmacy |
-0.6642784 |
-25.0 |
| Maine |
Transit |
-0.6634212 |
-50.0 |
| New Jersey |
Retail_Recreation |
-0.6475270 |
-62.5 |
| Kansas |
Parks |
0.6423213 |
72.0 |
| Connecticut |
Residential |
0.6336191 |
14.0 |
| Maryland |
Workplace |
-0.6306494 |
-35.0 |
| New York |
Retail_Recreation |
-0.6285045 |
-46.0 |
| Rhode Island |
Residential |
-0.6194063 |
18.5 |
| California |
Residential |
0.6123256 |
14.0 |
| California |
Grocery_Pharmacy |
-0.6060930 |
-12.0 |
| Montana |
Workplace |
-0.5901126 |
-40.5 |
| Nevada |
Transit |
-0.5848795 |
-20.0 |
| California |
Transit |
-0.5835718 |
-42.0 |
| Massachusetts |
Grocery_Pharmacy |
-0.5814299 |
-7.0 |
| Rhode Island |
Retail_Recreation |
-0.5704169 |
-45.0 |
| Alaska |
Workplace |
-0.5679702 |
-34.0 |
| Connecticut |
Workplace |
-0.5565283 |
-39.0 |
| West Virginia |
Parks |
0.5553745 |
-27.0 |
| New Jersey |
Parks |
-0.5357858 |
-6.0 |
| Montana |
Transit |
-0.5330404 |
-41.0 |
| Maine |
Workplace |
-0.5306826 |
-30.0 |
| Nevada |
Retail_Recreation |
-0.5168522 |
-43.0 |
| Montana |
Retail_Recreation |
-0.5165435 |
-51.0 |
| Idaho |
Workplace |
-0.5154350 |
-29.5 |
| Minnesota |
Parks |
0.4996525 |
-10.0 |
| Connecticut |
Retail_Recreation |
-0.4823740 |
-45.0 |
| Montana |
Parks |
-0.4797368 |
-58.0 |
| Kansas |
Grocery_Pharmacy |
-0.4790038 |
-14.0 |
| New Jersey |
Grocery_Pharmacy |
-0.4713537 |
2.5 |
| Arizona |
Grocery_Pharmacy |
-0.4711334 |
-15.0 |
| Maine |
Parks |
0.4705601 |
-31.0 |
| Montana |
Residential |
0.4580929 |
14.0 |
| Idaho |
Transit |
-0.4575232 |
-30.0 |
| Rhode Island |
Parks |
0.4514869 |
52.0 |
| Vermont |
Residential |
0.4431676 |
11.5 |
| Pennsylvania |
Workplace |
-0.4418295 |
-36.0 |
| Arkansas |
Parks |
-0.4362584 |
-12.0 |
| Massachusetts |
Transit |
-0.4334904 |
-45.0 |
| New Mexico |
Residential |
0.4308095 |
13.5 |
| New Jersey |
Transit |
-0.4283798 |
-50.5 |
| Idaho |
Grocery_Pharmacy |
-0.4261301 |
-4.0 |
| New York |
Parks |
0.4257291 |
20.0 |
| New York |
Transit |
-0.4242627 |
-48.0 |
| Montana |
Grocery_Pharmacy |
-0.4058904 |
-16.0 |
| Pennsylvania |
Retail_Recreation |
-0.3963362 |
-45.0 |
| Michigan |
Workplace |
-0.3963036 |
-40.0 |
| Virginia |
Retail_Recreation |
-0.3915238 |
-35.0 |
| Colorado |
Residential |
0.3868972 |
14.0 |
| Vermont |
Retail_Recreation |
0.3809664 |
-57.0 |
| Illinois |
Transit |
-0.3809151 |
-31.0 |
| Idaho |
Retail_Recreation |
-0.3791350 |
-41.0 |
| Florida |
Parks |
-0.3772689 |
-43.0 |
| Maryland |
Grocery_Pharmacy |
-0.3742974 |
-10.0 |
| New Mexico |
Grocery_Pharmacy |
-0.3726445 |
-11.5 |
| Colorado |
Workplace |
-0.3679019 |
-39.0 |
| Virginia |
Transit |
-0.3677731 |
-33.0 |
| Alabama |
Workplace |
-0.3677454 |
-29.0 |
| Wisconsin |
Transit |
-0.3662530 |
-23.5 |
| Arizona |
Transit |
0.3587083 |
-38.0 |
| Alaska |
Retail_Recreation |
0.3539900 |
-39.0 |
| New Mexico |
Retail_Recreation |
-0.3477916 |
-42.5 |
| Oregon |
Parks |
0.3463564 |
16.5 |
| North Dakota |
Grocery_Pharmacy |
-0.3436992 |
-9.5 |
| Arizona |
Residential |
0.3405418 |
13.0 |
| Maryland |
Retail_Recreation |
-0.3353905 |
-39.0 |
| Florida |
Residential |
0.3343833 |
14.0 |
| Minnesota |
Transit |
-0.3334180 |
-28.5 |
| Rhode Island |
Grocery_Pharmacy |
0.3330692 |
-7.5 |
| Colorado |
Retail_Recreation |
-0.3311956 |
-44.0 |
| Colorado |
Parks |
-0.3266929 |
2.0 |
| California |
Parks |
-0.3255591 |
-38.0 |
| Arizona |
Retail_Recreation |
-0.3228722 |
-42.5 |
| South Dakota |
Transit |
-0.3211172 |
-40.0 |
| Washington |
Transit |
-0.3177385 |
-33.5 |
| Mississippi |
Parks |
0.3162382 |
-25.0 |
| Arkansas |
Retail_Recreation |
-0.3157899 |
-30.0 |
| Texas |
Transit |
0.3112692 |
-42.0 |
| North Dakota |
Workplace |
0.3111708 |
-33.5 |
| Idaho |
Parks |
0.3084098 |
-22.0 |
| Virginia |
Workplace |
-0.3054002 |
-32.0 |
| Florida |
Transit |
-0.3048705 |
-49.0 |
| Colorado |
Grocery_Pharmacy |
-0.3045091 |
-17.0 |
| Illinois |
Workplace |
-0.2985202 |
-30.0 |
| Mississippi |
Grocery_Pharmacy |
-0.2968257 |
-8.0 |
| New York |
Grocery_Pharmacy |
-0.2931932 |
8.0 |
| Virginia |
Grocery_Pharmacy |
-0.2866612 |
-8.0 |
| Pennsylvania |
Parks |
0.2837927 |
13.0 |
| New Hampshire |
Residential |
-0.2819528 |
14.0 |
| Colorado |
Transit |
-0.2809509 |
-36.0 |
| Maine |
Grocery_Pharmacy |
-0.2802607 |
-13.0 |
| New Jersey |
Residential |
0.2782457 |
18.0 |
| Oregon |
Residential |
0.2721056 |
10.5 |
| Kentucky |
Parks |
0.2715847 |
28.5 |
| Arkansas |
Residential |
0.2657737 |
12.0 |
| Texas |
Residential |
-0.2557815 |
15.0 |
| Georgia |
Grocery_Pharmacy |
-0.2552937 |
-10.0 |
| Florida |
Workplace |
-0.2524014 |
-33.0 |
| Rhode Island |
Transit |
-0.2516145 |
-56.0 |
| Hawaii |
Residential |
-0.2498230 |
19.0 |
| New Hampshire |
Grocery_Pharmacy |
-0.2455163 |
-6.0 |
| Iowa |
Workplace |
-0.2453980 |
-29.0 |
| Massachusetts |
Residential |
0.2439284 |
15.0 |
| Tennessee |
Retail_Recreation |
-0.2408655 |
-30.0 |
| Indiana |
Grocery_Pharmacy |
-0.2396749 |
-5.5 |
| Maryland |
Residential |
0.2375251 |
15.0 |
| Michigan |
Grocery_Pharmacy |
-0.2374533 |
-11.0 |
| Nebraska |
Grocery_Pharmacy |
-0.2368763 |
0.0 |
| Maine |
Retail_Recreation |
-0.2351854 |
-42.0 |
| West Virginia |
Grocery_Pharmacy |
-0.2321920 |
-6.0 |
| Texas |
Parks |
0.2321367 |
-42.0 |
| Kansas |
Retail_Recreation |
-0.2294313 |
-39.0 |
| North Dakota |
Residential |
0.2292251 |
17.0 |
| Pennsylvania |
Grocery_Pharmacy |
-0.2271322 |
-6.0 |
| Oklahoma |
Grocery_Pharmacy |
0.2181002 |
-0.5 |
| South Carolina |
Residential |
0.2177518 |
12.0 |
| Georgia |
Retail_Recreation |
-0.2171438 |
-41.0 |
| Alabama |
Residential |
0.2153180 |
11.0 |
| Wisconsin |
Parks |
0.2143328 |
51.5 |
| Virginia |
Residential |
0.2088566 |
14.0 |
| Georgia |
Workplace |
-0.2064842 |
-33.5 |
| Michigan |
Retail_Recreation |
-0.2062983 |
-53.0 |
| Washington |
Workplace |
-0.2054725 |
-38.0 |
| Alabama |
Grocery_Pharmacy |
-0.2053822 |
-2.0 |
| Alabama |
Transit |
-0.2042660 |
-36.5 |
| West Virginia |
Retail_Recreation |
0.2024442 |
-38.5 |
| Iowa |
Residential |
-0.2001295 |
13.0 |
| Washington |
Parks |
0.1982925 |
-3.5 |
| Nevada |
Residential |
0.1979938 |
17.0 |
| Oklahoma |
Residential |
0.1973111 |
15.0 |
| South Dakota |
Retail_Recreation |
-0.1962572 |
-38.5 |
| South Carolina |
Workplace |
0.1954623 |
-30.0 |
| Ohio |
Transit |
0.1946730 |
-28.0 |
| Alabama |
Parks |
0.1944637 |
-1.0 |
| North Carolina |
Retail_Recreation |
-0.1883056 |
-33.0 |
| Michigan |
Parks |
0.1872264 |
30.0 |
| Tennessee |
Grocery_Pharmacy |
-0.1865674 |
6.0 |
| Kentucky |
Workplace |
-0.1863435 |
-35.0 |
| Tennessee |
Residential |
0.1854715 |
11.5 |
| West Virginia |
Workplace |
0.1850232 |
-32.5 |
| Nebraska |
Residential |
-0.1834972 |
14.0 |
| Wisconsin |
Workplace |
-0.1806554 |
-31.0 |
| New Hampshire |
Transit |
-0.1800090 |
-57.0 |
| Arizona |
Workplace |
-0.1790383 |
-35.0 |
| Illinois |
Residential |
0.1777245 |
14.0 |
| Texas |
Workplace |
0.1725234 |
-31.0 |
| Oklahoma |
Workplace |
-0.1709318 |
-30.5 |
| South Carolina |
Parks |
-0.1691941 |
-23.0 |
| Arkansas |
Workplace |
-0.1686989 |
-26.0 |
| New Hampshire |
Retail_Recreation |
-0.1674223 |
-41.0 |
| Nevada |
Workplace |
-0.1662423 |
-40.0 |
| South Carolina |
Retail_Recreation |
-0.1660328 |
-35.0 |
| Missouri |
Transit |
-0.1630578 |
-23.0 |
| North Carolina |
Transit |
0.1588124 |
-32.0 |
| Oklahoma |
Retail_Recreation |
0.1583250 |
-31.0 |
| South Dakota |
Grocery_Pharmacy |
0.1553864 |
-9.0 |
| Indiana |
Retail_Recreation |
-0.1550508 |
-38.0 |
| Oregon |
Grocery_Pharmacy |
0.1532142 |
-7.0 |
| Pennsylvania |
Transit |
-0.1519722 |
-41.5 |
| Florida |
Grocery_Pharmacy |
-0.1514709 |
-14.0 |
| Tennessee |
Workplace |
-0.1483637 |
-31.0 |
| Wisconsin |
Residential |
-0.1478320 |
14.0 |
| Wisconsin |
Grocery_Pharmacy |
0.1428404 |
-1.5 |
| Maine |
Residential |
-0.1386968 |
11.0 |
| Georgia |
Residential |
-0.1365163 |
13.0 |
| Oklahoma |
Parks |
-0.1362105 |
-18.5 |
| Idaho |
Residential |
-0.1345555 |
11.0 |
| Nebraska |
Transit |
0.1327007 |
-11.5 |
| Illinois |
Retail_Recreation |
-0.1289100 |
-40.0 |
| Arizona |
Parks |
0.1246325 |
-44.5 |
| Nebraska |
Workplace |
0.1243324 |
-32.5 |
| Florida |
Retail_Recreation |
-0.1240353 |
-43.0 |
| Pennsylvania |
Residential |
0.1237324 |
15.0 |
| Minnesota |
Retail_Recreation |
0.1223654 |
-40.5 |
| Kentucky |
Residential |
0.1187336 |
12.0 |
| Ohio |
Residential |
0.1181871 |
14.0 |
| North Carolina |
Parks |
-0.1168981 |
7.0 |
| Vermont |
Workplace |
-0.1127410 |
-43.0 |
| Tennessee |
Parks |
0.1087016 |
10.5 |
| Ohio |
Retail_Recreation |
0.1038262 |
-36.0 |
| Virginia |
Parks |
0.1037221 |
6.0 |
| Mississippi |
Workplace |
-0.1028413 |
-33.0 |
| Washington |
Residential |
0.0939541 |
13.0 |
| Illinois |
Grocery_Pharmacy |
-0.0935639 |
2.0 |
| Wisconsin |
Retail_Recreation |
-0.0923467 |
-44.0 |
| Washington |
Retail_Recreation |
-0.0912974 |
-42.0 |
| Kentucky |
Transit |
0.0891570 |
-31.0 |
| New Mexico |
Transit |
0.0881449 |
-37.0 |
| Michigan |
Residential |
0.0878612 |
15.0 |
| Indiana |
Workplace |
-0.0875347 |
-34.0 |
| Illinois |
Parks |
0.0875275 |
26.5 |
| Maryland |
Parks |
0.0864502 |
27.0 |
| New Mexico |
Workplace |
-0.0864431 |
-34.0 |
| Hawaii |
Workplace |
-0.0808873 |
-46.0 |
| New York |
Residential |
0.0797617 |
17.5 |
| Connecticut |
Parks |
0.0780428 |
43.0 |
| Arkansas |
Transit |
0.0735323 |
-27.0 |
| Nebraska |
Parks |
0.0726725 |
55.5 |
| Oregon |
Transit |
-0.0723629 |
-28.0 |
| North Carolina |
Grocery_Pharmacy |
0.0716555 |
1.0 |
| Nevada |
Parks |
-0.0700445 |
-12.5 |
| Iowa |
Parks |
-0.0698225 |
28.5 |
| New Hampshire |
Workplace |
-0.0665849 |
-37.0 |
| Ohio |
Grocery_Pharmacy |
0.0643872 |
0.0 |
| Indiana |
Residential |
0.0618211 |
12.0 |
| Kansas |
Residential |
-0.0603836 |
13.0 |
| Nebraska |
Retail_Recreation |
0.0587788 |
-37.5 |
| Missouri |
Grocery_Pharmacy |
-0.0580754 |
2.0 |
| Missouri |
Retail_Recreation |
-0.0575486 |
-36.5 |
| Texas |
Retail_Recreation |
-0.0556309 |
-39.0 |
| Kansas |
Transit |
-0.0519103 |
-26.5 |
| Minnesota |
Workplace |
-0.0517544 |
-33.0 |
| Kansas |
Workplace |
-0.0482002 |
-31.5 |
| North Carolina |
Residential |
0.0478427 |
13.0 |
| Iowa |
Retail_Recreation |
-0.0469994 |
-37.0 |
| Indiana |
Parks |
-0.0469890 |
29.0 |
| Maryland |
Transit |
-0.0467283 |
-39.0 |
| South Carolina |
Transit |
-0.0457601 |
-45.0 |
| Missouri |
Workplace |
0.0443428 |
-28.5 |
| Georgia |
Transit |
-0.0427605 |
-35.0 |
| Michigan |
Transit |
0.0427330 |
-46.0 |
| Mississippi |
Retail_Recreation |
0.0427132 |
-40.0 |
| Massachusetts |
Parks |
-0.0404740 |
39.0 |
| Washington |
Grocery_Pharmacy |
-0.0398682 |
-7.0 |
| Oregon |
Workplace |
-0.0395589 |
-32.0 |
| Iowa |
Grocery_Pharmacy |
-0.0388311 |
4.0 |
| Tennessee |
Transit |
0.0383680 |
-32.0 |
| Nevada |
Grocery_Pharmacy |
-0.0381444 |
-11.0 |
| South Carolina |
Grocery_Pharmacy |
-0.0340216 |
1.0 |
| Minnesota |
Grocery_Pharmacy |
-0.0336416 |
-5.0 |
| West Virginia |
Transit |
0.0329966 |
-45.0 |
| Missouri |
Parks |
0.0289776 |
0.0 |
| Kentucky |
Retail_Recreation |
0.0274012 |
-29.0 |
| Georgia |
Parks |
-0.0273686 |
-6.0 |
| West Virginia |
Residential |
0.0262065 |
11.0 |
| Mississippi |
Residential |
0.0259534 |
13.0 |
| North Dakota |
Transit |
-0.0248354 |
-48.0 |
| South Dakota |
Residential |
0.0243392 |
15.0 |
| Minnesota |
Residential |
0.0238146 |
17.0 |
| North Carolina |
Workplace |
0.0231928 |
-31.0 |
| Vermont |
Transit |
0.0227965 |
-63.0 |
| Mississippi |
Transit |
-0.0214496 |
-38.5 |
| Alabama |
Retail_Recreation |
0.0213752 |
-39.0 |
| Ohio |
Workplace |
-0.0192275 |
-35.0 |
| Iowa |
Transit |
0.0173974 |
-25.0 |
| Oklahoma |
Transit |
0.0172117 |
-26.0 |
| Arkansas |
Grocery_Pharmacy |
-0.0170389 |
3.5 |
| Kentucky |
Grocery_Pharmacy |
0.0162747 |
4.0 |
| Indiana |
Transit |
-0.0145231 |
-29.0 |
| Texas |
Grocery_Pharmacy |
-0.0144226 |
-13.0 |
| Ohio |
Parks |
0.0135843 |
67.5 |
| South Dakota |
Workplace |
0.0100289 |
-35.0 |
| Missouri |
Residential |
0.0081337 |
13.0 |
| Oregon |
Retail_Recreation |
0.0075100 |
-41.0 |
| Alaska |
Parks |
NA |
29.0 |
| District of Columbia |
Retail_Recreation |
NA |
-69.0 |
| District of Columbia |
Grocery_Pharmacy |
NA |
-28.0 |
| District of Columbia |
Parks |
NA |
-65.0 |
| District of Columbia |
Transit |
NA |
-69.0 |
| District of Columbia |
Workplace |
NA |
-48.0 |
| District of Columbia |
Residential |
NA |
17.0 |
# sanity check
ggplot(filter(plot_data,Province.State %in% c("Pennsylvania","Maryland","New Jersey","California","Delaware","Connecticut")),aes(x=Total_confirmed_cases.per100,fill=variable))+geom_histogram()+
facet_grid(~Province.State)+
default_theme+
theme(legend.position = "bottom")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

write_plot(mobility.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/mobility.plot.png"
write_plot(mobility.global.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/mobility.global.plot.png"
(plot_data.permobility_summary.plot<-ggplot(plot_data.permobility_summary,aes(x=variable,y=median_change))+
geom_jitter(size=2,width=.2)+
#geom_jitter(data=plot_data.permobility_summary %>% arrange(-abs(median_change)) %>% head(n=15),aes(col=Province.State),size=2,width=.2)+
default_theme+
ggtitle("Per-Sate Median Change in Mobility")+
xlab("Mobility Meaure")+
ylab("Median Change from Baseline"))

write_plot(plot_data.permobility_summary.plot,wd = results_dir)
## [1] "/Users/stevensmith/Projects/coronavirus/results/plot_data.permobility_summary.plot.png"